Dataset statistics
| Number of variables | 16 |
|---|---|
| Number of observations | 1074606 |
| Missing cells | 186888 |
| Missing cells (%) | 1.1% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 131.2 MiB |
| Average record size in memory | 128.0 B |
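The overview figures can be cross-checked against the per-variable numbers reported elsewhere in this report. The sketch below is only a sanity check, not part of the profiling output. It takes the four missing-value counts from the alerts (nflId, jerseyNumber, o, dir) and assumes every column occupies 8 bytes per row, an assumption that is consistent with the 8.2 MiB memory size listed for each variable.

```python
# Cross-check of the "Dataset statistics" table using numbers from this report.
n_vars = 16
n_rows = 1_074_606

# Missing-value counts as listed in the alerts.
missing_per_var = {"nflId": 46_722, "jerseyNumber": 46_722, "o": 46_722, "dir": 46_722}

missing_cells = sum(missing_per_var.values())
missing_pct = 100 * missing_cells / (n_vars * n_rows)

# Assumption: each column is stored as one 8-byte value per row.
bytes_per_value = 8
record_bytes = n_vars * bytes_per_value
column_mib = n_rows * bytes_per_value / 2**20
total_mib = n_rows * record_bytes / 2**20

print(missing_cells)            # 186888
print(round(missing_pct, 1))    # 1.1
print(record_bytes)             # 128
print(round(column_mib, 1))     # 8.2
print(round(total_mib, 1))      # 131.2
```

Under that assumption, the totals agree with the missing-cell count, the average record size, and the total memory size shown in the table.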
Variable types
| Numeric | 12 |
|---|---|
| Categorical | 4 |
Alerts
| time has a high cardinality: 43376 distinct values | High cardinality |
|---|---|
| gameId is highly correlated with team | High correlation |
| frameId is highly correlated with s and 1 other fields | High correlation |
| s is highly correlated with dis | High correlation |
| dis is highly correlated with s | High correlation |
| team is highly correlated with gameId | High correlation |
| nflId has 46722 (4.3%) missing values | Missing |
| jerseyNumber has 46722 (4.3%) missing values | Missing |
| o has 46722 (4.3%) missing values | Missing |
| dir has 46722 (4.3%) missing values | Missing |
| s has 64921 (6.0%) zeros | Zeros |
| a has 60354 (5.6%) zeros | Zeros |
| dis has 65343 (6.1%) zeros | Zeros |
Reproduction
| Analysis started | 2022-11-02 14:56:39.168171 |
|---|---|
| Analysis finished | 2022-11-02 14:58:15.968339 |
| Duration | 1 minute and 36.8 seconds |
| Software version | pandas-profiling v3.4.0 |
gameId
Real number (ℝ≥0)
| Distinct | 16 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2021099932 |
| Minimum | 2021093000 |
| Maximum | 2021100400 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.2 MiB |
Quantile statistics
| Minimum | 2021093000 |
|---|---|
| 5-th percentile | 2021093000 |
| Q1 | 2021100303 |
| median | 2021100307 |
| Q3 | 2021100311 |
| 95-th percentile | 2021100400 |
| Maximum | 2021100400 |
| Range | 7400 |
| Interquartile range (IQR) | 8 |
Descriptive statistics
| Standard deviation | 1625.232664 |
|---|---|
| Coefficient of variation (CV) | 8.041327587 × 10⁻⁷ |
| Kurtosis | 14.2437537 |
| Mean | 2021099932 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | -4.029836525 |
| Sum | 2.171886113 × 10¹⁵ |
| Variance | 2641381.212 |
| Monotonicity | Increasing |
| Value | Count | Frequency (%) |
| 2021100307 | 83605 | 7.8% |
| 2021100300 | 78407 | 7.3% |
| 2021100311 | 78177 | 7.3% |
| 2021100313 | 76613 | 7.1% |
| 2021100309 | 73048 | 6.8% |
| 2021100305 | 72657 | 6.8% |
| 2021100308 | 69391 | 6.5% |
| 2021100400 | 68103 | 6.3% |
| 2021100304 | 66378 | 6.2% |
| 2021100310 | 62445 | 5.8% |
| Other values (6) | 345782 |
| Value | Count | Frequency (%) |
| 2021093000 | 55982 | |
| 2021100300 | 78407 | |
| 2021100301 | 49036 | |
| 2021100302 | 60007 | |
| 2021100303 | 61410 | |
| 2021100304 | 66378 | |
| 2021100305 | 72657 | |
| 2021100306 | 59616 | |
| 2021100307 | 83605 | |
| 2021100308 | 69391 |
| Value | Count | Frequency (%) |
| 2021100400 | 68103 | |
| 2021100313 | 76613 | |
| 2021100312 | 59731 | |
| 2021100311 | 78177 | |
| 2021100310 | 62445 | |
| 2021100309 | 73048 | |
| 2021100308 | 69391 | |
| 2021100307 | 83605 | |
| 2021100306 | 59616 | |
| 2021100305 | 72657 |
playId
Real number (ℝ≥0)
| Distinct | 987 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2225.051625 |
| Minimum | 55 |
| Maximum | 5153 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.2 MiB |
Quantile statistics
| Minimum | 55 |
|---|---|
| 5-th percentile | 248 |
| Q1 | 1172 |
| median | 2252 |
| Q3 | 3312 |
| 95-th percentile | 4114 |
| Maximum | 5153 |
| Range | 5098 |
| Interquartile range (IQR) | 2140 |
Descriptive statistics
| Standard deviation | 1244.912922 |
|---|---|
| Coefficient of variation (CV) | 0.5594984443 |
| Kurtosis | -1.102648498 |
| Mean | 2225.051625 |
| Median Absolute Deviation (MAD) | 1074 |
| Skewness | 0.01012728136 |
| Sum | 2391053826 |
| Variance | 1549808.184 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2505 | 4922 | 0.5% |
| 198 | 3611 | 0.3% |
| 4138 | 3381 | 0.3% |
| 3206 | 3266 | 0.3% |
| 3331 | 2921 | 0.3% |
| 1522 | 2921 | 0.3% |
| 317 | 2852 | 0.3% |
| 2368 | 2806 | 0.3% |
| 2023 | 2783 | 0.3% |
| 76 | 2737 | 0.3% |
| Other values (977) | 1042406 |
| Value | Count | Frequency (%) |
| 55 | 1334 | |
| 56 | 667 | 0.1% |
| 59 | 1449 | |
| 76 | 2737 | |
| 78 | 897 | 0.1% |
| 80 | 667 | 0.1% |
| 81 | 690 | 0.1% |
| 83 | 1035 | 0.1% |
| 97 | 1449 | |
| 104 | 943 | 0.1% |
| Value | Count | Frequency (%) |
| 5153 | 1265 | |
| 5108 | 805 | |
| 5073 | 736 | |
| 5051 | 920 | |
| 5010 | 828 | |
| 4953 | 713 | |
| 4929 | 851 | |
| 4843 | 1035 | |
| 4824 | 1058 | |
| 4668 | 713 |
nflId
Real number (ℝ≥0)
| Distinct | 1174 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 46722 |
| Missing (%) | 4.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 45665.46062 |
| Minimum | 25511 |
| Maximum | 54006 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.2 MiB |
Quantile statistics
| Minimum | 25511 |
|---|---|
| 5-th percentile | 37130 |
| Q1 | 42431 |
| median | 45215 |
| Q3 | 48089 |
| 95-th percentile | 53476 |
| Maximum | 54006 |
| Range | 28495 |
| Interquartile range (IQR) | 5658 |
Descriptive statistics
| Standard deviation | 5042.442404 |
|---|---|
| Coefficient of variation (CV) | 0.110421363 |
| Kurtosis | -0.01150870586 |
| Mean | 45665.46062 |
| Median Absolute Deviation (MAD) | 2788 |
| Skewness | -0.1946866213 |
| Sum | 4.693879633 × 10¹⁰ |
| Variance | 25426225.4 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 41243 | 2211 | 0.2% |
| 38538 | 2211 | 0.2% |
| 47865 | 2211 | 0.2% |
| 40124 | 2211 | 0.2% |
| 52566 | 2211 | 0.2% |
| 47881 | 2104 | 0.2% |
| 41237 | 2022 | 0.2% |
| 38629 | 2010 | 0.2% |
| 52461 | 2002 | 0.2% |
| 52553 | 2002 | 0.2% |
| Other values (1164) | 1006689 | |
| (Missing) | 46722 | 4.3% |
| Value | Count | Frequency (%) |
| 25511 | 1707 | |
| 28963 | 1253 | |
| 29550 | 776 | |
| 29851 | 1344 | |
| 30842 | 431 | < 0.1% |
| 30869 | 1623 | |
| 33084 | 1769 | |
| 33107 | 925 | |
| 33130 | 350 | < 0.1% |
| 33131 | 1296 |
| Value | Count | Frequency (%) |
| 54006 | 82 | < 0.1% |
| 53999 | 76 | < 0.1% |
| 53978 | 427 | |
| 53960 | 141 | < 0.1% |
| 53957 | 702 | |
| 53953 | 287 | |
| 53946 | 194 | < 0.1% |
| 53900 | 182 | < 0.1% |
| 53711 | 118 | < 0.1% |
| 53679 | 127 | < 0.1% |
frameId
Real number (ℝ≥0)
| Distinct | 119 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 23.20987971 |
| Minimum | 1 |
| Maximum | 119 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.2 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 11 |
| median | 21 |
| Q3 | 32 |
| 95-th percentile | 51 |
| Maximum | 119 |
| Range | 118 |
| Interquartile range (IQR) | 21 |
Descriptive statistics
| Standard deviation | 15.50913708 |
|---|---|
| Coefficient of variation (CV) | 0.6682127296 |
| Kurtosis | 1.527209904 |
| Mean | 23.20987971 |
| Median Absolute Deviation (MAD) | 11 |
| Skewness | 0.9521722351 |
| Sum | 24941476 |
| Variance | 240.5333329 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 25599 | 2.4% |
| 13 | 25599 | 2.4% |
| 22 | 25599 | 2.4% |
| 21 | 25599 | 2.4% |
| 20 | 25599 | 2.4% |
| 19 | 25599 | 2.4% |
| 18 | 25599 | 2.4% |
| 17 | 25599 | 2.4% |
| 16 | 25599 | 2.4% |
| 2 | 25599 | 2.4% |
| Other values (109) | 818616 |
| Value | Count | Frequency (%) |
| 1 | 25599 | |
| 2 | 25599 | |
| 3 | 25599 | |
| 4 | 25599 | |
| 5 | 25599 | |
| 6 | 25599 | |
| 7 | 25599 | |
| 8 | 25599 | |
| 9 | 25599 | |
| 10 | 25599 |
| Value | Count | Frequency (%) |
| 119 | 23 | |
| 118 | 23 | |
| 117 | 46 | |
| 116 | 46 | |
| 115 | 46 | |
| 114 | 46 | |
| 113 | 46 | |
| 112 | 46 | |
| 111 | 46 | |
| 110 | 46 |
time
Categorical
| Distinct | 43376 |
|---|---|
| Distinct (%) | 4.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 8.2 MiB |
| 2021-10-03T19:38:04.100 | 92 |
|---|---|
| 2021-10-03T19:38:04.000 | 92 |
| 2021-10-03T19:38:03.900 | 92 |
| 2021-10-03T19:38:03.800 | 92 |
| 2021-10-03T19:38:03.700 | 92 |
| Other values (43371) | 1074146 |
Length
| Max length | 23 |
|---|---|
| Median length | 23 |
| Mean length | 23 |
| Min length | 23 |
Characters and Unicode
| Total characters | 24715938 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2021-10-01T00:29:42.800 |
|---|---|
| 2nd row | 2021-10-01T00:29:42.900 |
| 3rd row | 2021-10-01T00:29:43.000 |
| 4th row | 2021-10-01T00:29:43.100 |
| 5th row | 2021-10-01T00:29:43.200 |
Common Values
| Value | Count | Frequency (%) |
| 2021-10-03T19:38:04.100 | 92 | < 0.1% |
| 2021-10-03T19:38:04.000 | 92 | < 0.1% |
| 2021-10-03T19:38:03.900 | 92 | < 0.1% |
| 2021-10-03T19:38:03.800 | 92 | < 0.1% |
| 2021-10-03T19:38:03.700 | 92 | < 0.1% |
| 2021-10-03T19:38:03.600 | 92 | < 0.1% |
| 2021-10-03T19:38:03.500 | 92 | < 0.1% |
| 2021-10-03T19:38:04.200 | 92 | < 0.1% |
| 2021-10-03T19:26:13.100 | 69 | < 0.1% |
| 2021-10-03T19:26:12.600 | 69 | < 0.1% |
| Other values (43366) | 1073732 |
Length
| Value | Count | Frequency (%) |
| 2021-10-03t19:38:04.100 | 92 | < 0.1% |
| 2021-10-03t19:38:03.900 | 92 | < 0.1% |
| 2021-10-03t19:38:03.800 | 92 | < 0.1% |
| 2021-10-03t19:38:03.700 | 92 | < 0.1% |
| 2021-10-03t19:38:03.600 | 92 | < 0.1% |
| 2021-10-03t19:38:03.500 | 92 | < 0.1% |
| 2021-10-03t19:38:04.200 | 92 | < 0.1% |
| 2021-10-03t19:38:04.000 | 92 | < 0.1% |
| 2021-10-03t19:48:30.100 | 69 | < 0.1% |
| 2021-10-03t19:24:19.000 | 69 | < 0.1% |
| Other values (43366) | 1073732 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 6423854 | |
| 1 | 3564287 | |
| 2 | 3293209 | |
| - | 2149212 | 8.7% |
| : | 2149212 | 8.7% |
| 3 | 1591876 | 6.4% |
| T | 1074606 | 4.3% |
| . | 1074606 | 4.3% |
| 4 | 793178 | 3.2% |
| 5 | 750582 | 3.0% |
| Other values (4) | 1851316 | 7.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 18268302 | |
| Other Punctuation | 3223818 | 13.0% |
| Dash Punctuation | 2149212 | 8.7% |
| Uppercase Letter | 1074606 | 4.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 6423854 | |
| 1 | 3564287 | |
| 2 | 3293209 | |
| 3 | 1591876 | 8.7% |
| 4 | 793178 | 4.3% |
| 5 | 750582 | 4.1% |
| 7 | 517500 | 2.8% |
| 9 | 507357 | 2.8% |
| 8 | 490429 | 2.7% |
| 6 | 336030 | 1.8% |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 2149212 | |
| . | 1074606 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2149212 |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 1074606 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 23641332 | |
| Latin | 1074606 | 4.3% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 6423854 | |
| 1 | 3564287 | |
| 2 | 3293209 | |
| - | 2149212 | 9.1% |
| : | 2149212 | 9.1% |
| 3 | 1591876 | 6.7% |
| . | 1074606 | 4.5% |
| 4 | 793178 | 3.4% |
| 5 | 750582 | 3.2% |
| 7 | 517500 | 2.2% |
| Other values (3) | 1333816 | 5.6% |
Latin
| Value | Count | Frequency (%) |
| T | 1074606 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 24715938 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 6423854 | |
| 1 | 3564287 | |
| 2 | 3293209 | |
| - | 2149212 | 8.7% |
| : | 2149212 | 8.7% |
| 3 | 1591876 | 6.4% |
| T | 1074606 | 4.3% |
| . | 1074606 | 4.3% |
| 4 | 793178 | 3.2% |
| 5 | 750582 | 3.0% |
| Other values (4) | 1851316 | 7.5% |
jerseyNumber
Real number (ℝ≥0)
| Distinct | 99 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 46722 |
| Missing (%) | 4.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 50.08508548 |
| Minimum | 1 |
| Maximum | 99 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.2 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 23 |
| median | 52 |
| Q3 | 76 |
| 95-th percentile | 96 |
| Maximum | 99 |
| Range | 98 |
| Interquartile range (IQR) | 53 |
Descriptive statistics
| Standard deviation | 29.9047556 |
|---|---|
| Coefficient of variation (CV) | 0.5970790569 |
| Kurtosis | -1.324927188 |
| Mean | 50.08508548 |
| Median Absolute Deviation (MAD) | 27 |
| Skewness | 0.01619268276 |
| Sum | 51481658 |
| Variance | 894.2944076 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 23 | 22465 | 2.1% |
| 2 | 19372 | 1.8% |
| 26 | 19147 | 1.8% |
| 76 | 18699 | 1.7% |
| 97 | 17630 | 1.6% |
| 11 | 17378 | 1.6% |
| 77 | 16813 | 1.6% |
| 99 | 16459 | 1.5% |
| 22 | 16450 | 1.5% |
| 21 | 16367 | 1.5% |
| Other values (89) | 847104 | |
| (Missing) | 46722 | 4.3% |
| Value | Count | Frequency (%) |
| 1 | 11990 | |
| 2 | 19372 | |
| 3 | 6908 | 0.6% |
| 4 | 9871 | |
| 5 | 6417 | 0.6% |
| 6 | 9931 | |
| 7 | 8813 | |
| 8 | 9695 | |
| 9 | 8779 | |
| 10 | 11103 |
| Value | Count | Frequency (%) |
| 99 | 16459 | |
| 98 | 15458 | |
| 97 | 17630 | |
| 96 | 9644 | |
| 95 | 10474 | |
| 94 | 15613 | |
| 93 | 10263 | |
| 92 | 5872 | 0.5% |
| 91 | 13004 | |
| 90 | 14454 |
team
Categorical
| Distinct | 33 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 8.2 MiB |
| football | 46722 |
|---|---|
| NYJ | 39985 |
| TEN | 39985 |
| ATL | 37499 |
| WAS | 37499 |
| Other values (28) | 872916 |
Length
| Max length | 8 |
|---|---|
| Median length | 3 |
| Mean length | 2.974586034 |
| Min length | 2 |
Characters and Unicode
| Total characters | 3196508 |
|---|---|
| Distinct characters | 30 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | JAX |
|---|---|
| 2nd row | JAX |
| 3rd row | JAX |
| 4th row | JAX |
| 5th row | JAX |
Common Values
| Value | Count | Frequency (%) |
| football | 46722 | 4.3% |
| NYJ | 39985 | 3.7% |
| TEN | 39985 | 3.7% |
| ATL | 37499 | 3.5% |
| WAS | 37499 | 3.5% |
| DEN | 37389 | 3.5% |
| BAL | 37389 | 3.5% |
| NE | 36641 | 3.4% |
| TB | 36641 | 3.4% |
| ARI | 34936 | 3.3% |
| Other values (23) | 689920 |
Length
| Value | Count | Frequency (%) |
| football | 46722 | 4.3% |
| nyj | 39985 | 3.7% |
| ten | 39985 | 3.7% |
| atl | 37499 | 3.5% |
| was | 37499 | 3.5% |
| den | 37389 | 3.5% |
| bal | 37389 | 3.5% |
| ne | 36641 | 3.4% |
| tb | 36641 | 3.4% |
| ari | 34936 | 3.3% |
| Other values (23) | 689920 |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 361955 | 11.3% |
| N | 304293 | 9.5% |
| I | 250404 | 7.8% |
| L | 239085 | 7.5% |
| E | 207328 | 6.5% |
| C | 185350 | 5.8% |
| T | 171391 | 5.4% |
| D | 127204 | 4.0% |
| B | 126049 | 3.9% |
| S | 97229 | 3.0% |
| Other values (20) | 1126220 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 2822732 | |
| Lowercase Letter | 373776 | 11.7% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 361955 | |
| N | 304293 | |
| I | 250404 | 8.9% |
| L | 239085 | 8.5% |
| E | 207328 | 7.3% |
| C | 185350 | 6.6% |
| T | 171391 | 6.1% |
| D | 127204 | 4.5% |
| B | 126049 | 4.5% |
| S | 97229 | 3.4% |
| Other values (14) | 752444 |
Lowercase Letter
| Value | Count | Frequency (%) |
| l | 93444 | |
| o | 93444 | |
| a | 46722 | |
| b | 46722 | |
| t | 46722 | |
| f | 46722 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3196508 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| A | 361955 | 11.3% |
| N | 304293 | 9.5% |
| I | 250404 | 7.8% |
| L | 239085 | 7.5% |
| E | 207328 | 6.5% |
| C | 185350 | 5.8% |
| T | 171391 | 5.4% |
| D | 127204 | 4.0% |
| B | 126049 | 3.9% |
| S | 97229 | 3.0% |
| Other values (20) | 1126220 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3196508 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| A | 361955 | 11.3% |
| N | 304293 | 9.5% |
| I | 250404 | 7.8% |
| L | 239085 | 7.5% |
| E | 207328 | 6.5% |
| C | 185350 | 5.8% |
| T | 171391 | 5.4% |
| D | 127204 | 4.0% |
| B | 126049 | 3.9% |
| S | 97229 | 3.0% |
| Other values (20) | 1126220 |
playDirection
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 8.2 MiB |
| left | 553725 |
|---|---|
| right | 520881 |
Length
| Max length | 5 |
|---|---|
| Median length | 4 |
| Mean length | 4.48471812 |
| Min length | 4 |
Characters and Unicode
| Total characters | 4819305 |
|---|---|
| Distinct characters | 8 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | right |
|---|---|
| 2nd row | right |
| 3rd row | right |
| 4th row | right |
| 5th row | right |
Common Values
| Value | Count | Frequency (%) |
| left | 553725 | |
| right | 520881 |
Length
Category Frequency Plot
| Value | Count | Frequency (%) |
| left | 553725 | |
| right | 520881 |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 1074606 | |
| l | 553725 | |
| e | 553725 | |
| f | 553725 | |
| r | 520881 | |
| i | 520881 | |
| g | 520881 | |
| h | 520881 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 4819305 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 1074606 | |
| l | 553725 | |
| e | 553725 | |
| f | 553725 | |
| r | 520881 | |
| i | 520881 | |
| g | 520881 | |
| h | 520881 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 4819305 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| t | 1074606 | |
| l | 553725 | |
| e | 553725 | |
| f | 553725 | |
| r | 520881 | |
| i | 520881 | |
| g | 520881 | |
| h | 520881 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4819305 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| t | 1074606 | |
| l | 553725 | |
| e | 553725 | |
| f | 553725 | |
| r | 520881 | |
| i | 520881 | |
| g | 520881 | |
| h | 520881 |
x
Real number (ℝ)
| Distinct | 11870 |
|---|---|
| Distinct (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 56.16086157 |
| Minimum | -1.9 |
| Maximum | 120 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Negative | 30 |
| Negative (%) | < 0.1% |
| Memory size | 8.2 MiB |
Quantile statistics
| Minimum | -1.9 |
|---|---|
| 5-th percentile | 17.24 |
| Q1 | 36.87 |
| median | 55.23 |
| Q3 | 75.08 |
| 95-th percentile | 96.81 |
| Maximum | 120 |
| Range | 121.9 |
| Interquartile range (IQR) | 38.21 |
Descriptive statistics
| Standard deviation | 24.51044829 |
|---|---|
| Coefficient of variation (CV) | 0.4364329109 |
| Kurtosis | -0.8003671711 |
| Mean | 56.16086157 |
| Median Absolute Deviation (MAD) | 19.11 |
| Skewness | 0.109251926 |
| Sum | 60350798.81 |
| Variance | 600.7620755 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 48.49 | 205 | < 0.1% |
| 48.5 | 202 | < 0.1% |
| 35.28 | 199 | < 0.1% |
| 54.66 | 197 | < 0.1% |
| 45.57 | 197 | < 0.1% |
| 52.34 | 197 | < 0.1% |
| 45.95 | 195 | < 0.1% |
| 54.68 | 193 | < 0.1% |
| 34.03 | 193 | < 0.1% |
| 52.17 | 193 | < 0.1% |
| Other values (11860) | 1072635 |
| Value | Count | Frequency (%) |
| -1.9 | 1 | |
| -1.67 | 1 | |
| -1.43 | 1 | |
| -1.17 | 1 | |
| -0.88 | 1 | |
| -0.75 | 1 | |
| -0.74 | 1 | |
| -0.71 | 1 | |
| -0.68 | 1 | |
| -0.64 | 1 |
| Value | Count | Frequency (%) |
| 120 | 2 | |
| 119.99 | 1 | |
| 119.98 | 1 | |
| 119.96 | 1 | |
| 119.95 | 1 | |
| 119.91 | 1 | |
| 119.9 | 1 | |
| 119.86 | 1 | |
| 119.83 | 1 | |
| 119.81 | 1 |
y
Real number (ℝ)
| Distinct | 5391 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 26.83270419 |
| Minimum | -1.76 |
| Maximum | 54.88 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 14 |
| Negative (%) | < 0.1% |
| Memory size | 8.2 MiB |
Quantile statistics
| Minimum | -1.76 |
|---|---|
| 5-th percentile | 11.83 |
| Q1 | 22.09 |
| median | 26.87 |
| Q3 | 31.56 |
| 95-th percentile | 41.64 |
| Maximum | 54.88 |
| Range | 56.64 |
| Interquartile range (IQR) | 9.47 |
Descriptive statistics
| Standard deviation | 8.239370412 |
|---|---|
| Coefficient of variation (CV) | 0.307064482 |
| Kurtosis | 0.352243321 |
| Mean | 26.83270419 |
| Median Absolute Deviation (MAD) | 4.74 |
| Skewness | -0.01739703146 |
| Sum | 28834584.92 |
| Variance | 67.88722479 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 29.8 | 1050 | 0.1% |
| 29.77 | 1027 | 0.1% |
| 29.74 | 1023 | 0.1% |
| 29.76 | 1021 | 0.1% |
| 29.83 | 1017 | 0.1% |
| 29.75 | 1009 | 0.1% |
| 29.87 | 999 | 0.1% |
| 29.79 | 989 | 0.1% |
| 29.85 | 974 | 0.1% |
| 29.88 | 973 | 0.1% |
| Other values (5381) | 1064524 |
| Value | Count | Frequency (%) |
| -1.76 | 1 | |
| -1.7 | 1 | |
| -1.29 | 1 | |
| -1.21 | 1 | |
| -0.78 | 1 | |
| -0.75 | 1 | |
| -0.69 | 1 | |
| -0.51 | 1 | |
| -0.41 | 1 | |
| -0.31 | 1 |
| Value | Count | Frequency (%) |
| 54.88 | 1 | |
| 54.86 | 1 | |
| 54.85 | 1 | |
| 54.77 | 1 | |
| 54.74 | 1 | |
| 54.64 | 2 | |
| 54.63 | 2 | |
| 54.61 | 2 | |
| 54.59 | 1 | |
| 54.58 | 2 |
s
Real number (ℝ≥0)
| Distinct | 2166 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.604704617 |
| Minimum | 0 |
| Maximum | 27.77 |
| Zeros | 64921 |
| Zeros (%) | 6.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.76 |
| median | 2.15 |
| Q3 | 3.85 |
| 95-th percentile | 6.81 |
| Maximum | 27.77 |
| Range | 27.77 |
| Interquartile range (IQR) | 3.09 |
Descriptive statistics
| Standard deviation | 2.409214602 |
|---|---|
| Coefficient of variation (CV) | 0.924947338 |
| Kurtosis | 14.46556707 |
| Mean | 2.604704617 |
| Median Absolute Deviation (MAD) | 1.51 |
| Skewness | 2.360674215 |
| Sum | 2799031.21 |
| Variance | 5.804314999 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 64921 | 6.0% |
| 0.01 | 16403 | 1.5% |
| 0.02 | 9712 | 0.9% |
| 0.03 | 7255 | 0.7% |
| 0.04 | 5806 | 0.5% |
| 0.05 | 4970 | 0.5% |
| 0.06 | 4570 | 0.4% |
| 0.07 | 4266 | 0.4% |
| 0.08 | 4039 | 0.4% |
| 0.09 | 3869 | 0.4% |
| Other values (2156) | 948795 |
| Value | Count | Frequency (%) |
| 0 | 64921 | |
| 0.01 | 16403 | 1.5% |
| 0.02 | 9712 | 0.9% |
| 0.03 | 7255 | 0.7% |
| 0.04 | 5806 | 0.5% |
| 0.05 | 4970 | 0.5% |
| 0.06 | 4570 | 0.4% |
| 0.07 | 4266 | 0.4% |
| 0.08 | 4039 | 0.4% |
| 0.09 | 3869 | 0.4% |
| Value | Count | Frequency (%) |
| 27.77 | 1 | |
| 27.6 | 1 | |
| 27.36 | 1 | |
| 27.32 | 2 | |
| 27.25 | 1 | |
| 27.22 | 1 | |
| 27.17 | 1 | |
| 27.14 | 1 | |
| 27.11 | 1 | |
| 27.1 | 1 |
a
Real number (ℝ≥0)
| Distinct | 1546 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.804659438 |
| Minimum | 0 |
| Maximum | 27.5 |
| Zeros | 60354 |
| Zeros (%) | 5.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.72 |
| median | 1.55 |
| Q3 | 2.6 |
| 95-th percentile | 4.48 |
| Maximum | 27.5 |
| Range | 27.5 |
| Interquartile range (IQR) | 1.88 |
Descriptive statistics
| Standard deviation | 1.443306065 |
|---|---|
| Coefficient of variation (CV) | 0.7997664462 |
| Kurtosis | 5.95139441 |
| Mean | 1.804659438 |
| Median Absolute Deviation (MAD) | 0.92 |
| Skewness | 1.402756052 |
| Sum | 1939297.86 |
| Variance | 2.083132398 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 60354 | 5.6% |
| 0.01 | 13098 | 1.2% |
| 0.02 | 7404 | 0.7% |
| 0.03 | 5734 | 0.5% |
| 0.04 | 4624 | 0.4% |
| 0.05 | 4004 | 0.4% |
| 1.01 | 3452 | 0.3% |
| 1.2 | 3434 | 0.3% |
| 0.06 | 3416 | 0.3% |
| 1.19 | 3366 | 0.3% |
| Other values (1536) | 965720 |
| Value | Count | Frequency (%) |
| 0 | 60354 | |
| 0.01 | 13098 | 1.2% |
| 0.02 | 7404 | 0.7% |
| 0.03 | 5734 | 0.5% |
| 0.04 | 4624 | 0.4% |
| 0.05 | 4004 | 0.4% |
| 0.06 | 3416 | 0.3% |
| 0.07 | 3120 | 0.3% |
| 0.08 | 2763 | 0.3% |
| 0.09 | 2605 | 0.2% |
| Value | Count | Frequency (%) |
| 27.5 | 1 | |
| 26.97 | 1 | |
| 25.96 | 1 | |
| 24.44 | 1 | |
| 24.24 | 1 | |
| 24.19 | 1 | |
| 23.52 | 1 | |
| 23.41 | 1 | |
| 23.11 | 1 | |
| 23.09 | 1 |
dis
Real number (ℝ≥0)
| Distinct | 549 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.2636458293 |
| Minimum | 0 |
| Maximum | 8.08 |
| Zeros | 65343 |
| Zeros (%) | 6.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.08 |
| median | 0.22 |
| Q3 | 0.39 |
| 95-th percentile | 0.68 |
| Maximum | 8.08 |
| Range | 8.08 |
| Interquartile range (IQR) | 0.31 |
Descriptive statistics
| Standard deviation | 0.2584563439 |
|---|---|
| Coefficient of variation (CV) | 0.9803164519 |
| Kurtosis | 51.0160923 |
| Mean | 0.2636458293 |
| Median Absolute Deviation (MAD) | 0.15 |
| Skewness | 4.271668359 |
| Sum | 283315.39 |
| Variance | 0.06679968171 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 65343 | 6.1% |
| 0.01 | 58033 | 5.4% |
| 0.02 | 33135 | 3.1% |
| 0.03 | 25486 | 2.4% |
| 0.04 | 22634 | 2.1% |
| 0.05 | 20891 | 1.9% |
| 0.06 | 19922 | 1.9% |
| 0.16 | 19651 | 1.8% |
| 0.18 | 19631 | 1.8% |
| 0.19 | 19500 | 1.8% |
| Other values (539) | 770380 |
| Value | Count | Frequency (%) |
| 0 | 65343 | |
| 0.01 | 58033 | |
| 0.02 | 33135 | |
| 0.03 | 25486 | 2.4% |
| 0.04 | 22634 | 2.1% |
| 0.05 | 20891 | 1.9% |
| 0.06 | 19922 | 1.9% |
| 0.07 | 19442 | 1.8% |
| 0.08 | 19311 | 1.8% |
| 0.09 | 19103 | 1.8% |
| Value | Count | Frequency (%) |
| 8.08 | 1 | |
| 7.25 | 1 | |
| 6.63 | 1 | |
| 6.61 | 1 | |
| 6.48 | 1 | |
| 6.43 | 1 | |
| 6.4 | 1 | |
| 6.26 | 1 | |
| 6.23 | 1 | |
| 6.21 | 2 |
o
Real number (ℝ≥0)
| Distinct | 36001 |
|---|---|
| Distinct (%) | 3.5% |
| Missing | 46722 |
| Missing (%) | 4.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 182.0093449 |
| Minimum | 0 |
| Maximum | 360 |
| Zeros | 8 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 32.76 |
| Q1 | 91.38 |
| median | 182.065 |
| Q3 | 270.95 |
| 95-th percentile | 330.83 |
| Maximum | 360 |
| Range | 360 |
| Interquartile range (IQR) | 179.57 |
Descriptive statistics
| Standard deviation | 99.07906599 |
|---|---|
| Coefficient of variation (CV) | 0.5443625218 |
| Kurtosis | -1.366207262 |
| Mean | 182.0093449 |
| Median Absolute Deviation (MAD) | 89.775 |
| Skewness | -0.01051411614 |
| Sum | 187084493.5 |
| Variance | 9816.661318 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 90 | 173 | < 0.1% |
| 266.72 | 110 | < 0.1% |
| 88.46 | 106 | < 0.1% |
| 265.28 | 105 | < 0.1% |
| 265.4 | 105 | < 0.1% |
| 268.43 | 105 | < 0.1% |
| 265.06 | 105 | < 0.1% |
| 90.85 | 102 | < 0.1% |
| 265.18 | 102 | < 0.1% |
| 87.22 | 102 | < 0.1% |
| Other values (35991) | 1026769 | |
| (Missing) | 46722 | 4.3% |
| Value | Count | Frequency (%) |
| 0 | 8 | < 0.1% |
| 0.01 | 21 | |
| 0.02 | 22 | |
| 0.03 | 19 | |
| 0.04 | 19 | |
| 0.05 | 22 | |
| 0.06 | 17 | |
| 0.07 | 11 | |
| 0.08 | 18 | |
| 0.09 | 14 |
| Value | Count | Frequency (%) |
| 360 | 7 | < 0.1% |
| 359.99 | 18 | |
| 359.98 | 17 | |
| 359.97 | 16 | |
| 359.96 | 16 | |
| 359.95 | 12 | |
| 359.94 | 27 | |
| 359.93 | 12 | |
| 359.92 | 25 | |
| 359.91 | 13 |
dir
Real number (ℝ≥0)
| Distinct | 36001 |
|---|---|
| Distinct (%) | 3.5% |
| Missing | 46722 |
| Missing (%) | 4.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 181.6884234 |
| Minimum | 0 |
| Maximum | 360 |
| Zeros | 14 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 24.05 |
| Q1 | 91.69 |
| median | 181.97 |
| Q3 | 271.59 |
| 95-th percentile | 337.6 |
| Maximum | 360 |
| Range | 360 |
| Interquartile range (IQR) | 179.9 |
Descriptive statistics
| Standard deviation | 101.2341019 |
|---|---|
| Coefficient of variation (CV) | 0.5571852075 |
| Kurtosis | -1.283374214 |
| Mean | 181.6884234 |
| Median Absolute Deviation (MAD) | 89.95 |
| Skewness | -0.0120161294 |
| Sum | 186754623.4 |
| Variance | 10248.34338 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 89.02 | 72 | < 0.1% |
| 88.25 | 72 | < 0.1% |
| 261.4 | 69 | < 0.1% |
| 260.4 | 69 | < 0.1% |
| 86.58 | 69 | < 0.1% |
| 100.15 | 69 | < 0.1% |
| 276.72 | 68 | < 0.1% |
| 270.15 | 68 | < 0.1% |
| 89.11 | 68 | < 0.1% |
| 91.07 | 68 | < 0.1% |
| Other values (35991) | 1027192 | |
| (Missing) | 46722 | 4.3% |
| Value | Count | Frequency (%) |
| 0 | 14 | |
| 0.01 | 24 | |
| 0.02 | 22 | |
| 0.03 | 24 | |
| 0.04 | 24 | |
| 0.05 | 19 | |
| 0.06 | 21 | |
| 0.07 | 19 | |
| 0.08 | 20 | |
| 0.09 | 24 |
| Value | Count | Frequency (%) |
| 360 | 10 | < 0.1% |
| 359.99 | 30 | |
| 359.98 | 21 | |
| 359.97 | 30 | |
| 359.96 | 31 | |
| 359.95 | 22 | |
| 359.94 | 28 | |
| 359.93 | 34 | |
| 359.92 | 25 | |
| 359.91 | 31 |
event
Categorical
| Distinct | 22 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 8.2 MiB |
| None | 992956 |
|---|---|
| ball_snap | 25484 |
| pass_forward | 22448 |
| autoevent_passforward | 10534 |
| autoevent_ballsnap | 10189 |
| Other values (17) | 12995 |
Length
| Max length | 25 |
|---|---|
| Median length | 4 |
| Mean length | 4.661016224 |
| Min length | 3 |
Characters and Unicode
| Total characters | 5008756 |
|---|---|
| Distinct characters | 25 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | None |
|---|---|
| 2nd row | None |
| 3rd row | None |
| 4th row | None |
| 5th row | None |
Common Values
| Value | Count | Frequency (%) |
| None | 992956 | |
| ball_snap | 25484 | 2.4% |
| pass_forward | 22448 | 2.1% |
| autoevent_passforward | 10534 | 1.0% |
| autoevent_ballsnap | 10189 | 0.9% |
| play_action | 6325 | 0.6% |
| run | 1541 | 0.1% |
| qb_sack | 1426 | 0.1% |
| pass_arrived | 1012 | 0.1% |
| man_in_motion | 598 | 0.1% |
| Other values (12) | 2093 | 0.2% |
Length
| Value | Count | Frequency (%) |
| none | 992956 | |
| ball_snap | 25484 | 2.4% |
| pass_forward | 22448 | 2.1% |
| autoevent_passforward | 10534 | 1.0% |
| autoevent_ballsnap | 10189 | 0.9% |
| play_action | 6325 | 0.6% |
| run | 1541 | 0.1% |
| qb_sack | 1426 | 0.1% |
| pass_arrived | 1012 | 0.1% |
| man_in_motion | 598 | 0.1% |
| Other values (12) | 2093 | 0.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 1060714 | |
| o | 1055309 | |
| e | 1038933 | |
| N | 992956 | |
| a | 176732 | 3.5% |
| s | 108169 | 2.2% |
| _ | 80615 | 1.6% |
| p | 78476 | 1.6% |
| l | 78200 | 1.6% |
| r | 71047 | 1.4% |
| Other values (15) | 267605 | 5.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 3935185 | |
| Uppercase Letter | 992956 | 19.8% |
| Connector Punctuation | 80615 | 1.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| n | 1060714 | |
| o | 1055309 | |
| e | 1038933 | |
| a | 176732 | 4.5% |
| s | 108169 | 2.7% |
| p | 78476 | 2.0% |
| l | 78200 | 2.0% |
| r | 71047 | 1.8% |
| t | 52532 | 1.3% |
| b | 37352 | 0.9% |
| Other values (13) | 177721 | 4.5% |
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 992956 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 80615 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 4928141 | |
| Common | 80615 | 1.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| n | 1060714 | |
| o | 1055309 | |
| e | 1038933 | |
| N | 992956 | |
| a | 176732 | 3.6% |
| s | 108169 | 2.2% |
| p | 78476 | 1.6% |
| l | 78200 | 1.6% |
| r | 71047 | 1.4% |
| t | 52532 | 1.1% |
| Other values (14) | 215073 | 4.4% |
Common
| Value | Count | Frequency (%) |
| _ | 80615 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 5008756 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| n | 1060714 | 21.2% |
| o | 1055309 | 21.1% |
| e | 1038933 | 20.7% |
| N | 992956 | 19.8% |
| a | 176732 | 3.5% |
| s | 108169 | 2.2% |
| _ | 80615 | 1.6% |
| p | 78476 | 1.6% |
| l | 78200 | 1.6% |
| r | 71047 | 1.4% |
| Other values (15) | 267605 | 5.3% |
Auto
The auto setting is an easily interpretable pairwise column metric that uses the following mapping (vartype-vartype : method): categorical-categorical : Cramér's V; numerical-categorical : Cramér's V (using a discretized numerical column); numerical-numerical : Spearman's ρ. This configuration uses the most suitable method for each pair of columns.
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.
To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
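The rank-covariance calculation described above can be sketched in plain Python. This is a minimal illustration and not the code the report itself runs; the helper names `average_ranks` and `spearman_rho` are hypothetical, and the inputs are the `s` and `dis` values from rows 5 to 9 of the First rows table.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the ranks divided by the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums are used throughout.
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# s and dis for rows 5-9 of the First rows table.
speed = [0.00, 0.01, 0.08, 0.20, 0.44]
distance = [0.00, 0.00, 0.01, 0.02, 0.04]
rho = spearman_rho(speed, distance)
```

Because only the ranks enter the calculation, any strictly increasing relationship (for example, y equal to the cube of x) yields ρ = 1 even though it is not linear.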
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.
To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
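As a rough sketch of that definition, the snippet below computes r directly from the covariance and standard deviations and checks the location and scale invariance. The function name `pearson_r` is hypothetical, and the inputs are the `s` and `dis` values from rows 5 to 9 of the First rows table.

```python
from statistics import mean


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    # The 1/n factors cancel between numerator and denominator.
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# s and dis for rows 5-9 of the First rows table.
speed = [0.00, 0.01, 0.08, 0.20, 0.44]
distance = [0.00, 0.00, 0.01, 0.02, 0.04]
r = pearson_r(speed, distance)

# Shifting and rescaling one variable (positive scale) leaves r unchanged.
rescaled_speed = [3 + 2 * v for v in speed]
r_rescaled = pearson_r(rescaled_speed, distance)
```

On these few frames r is close to 1, which is consistent with the high correlation between `s` and `dis` that the report flags.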
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.
To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
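The pair-counting definition can be illustrated with a short sketch. It implements exactly the formula stated above (often called τ-a), which makes no adjustment for ties; library implementations commonly use a tie-adjusted variant such as τ-b, so their results may differ on tied data. The function name `kendall_tau_a` and the sample values are illustrative assumptions.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    pairs = list(combinations(zip(x, y), 2))
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in pairs:
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1  # both variables move in the same direction
        elif product < 0:
            discordant += 1  # the variables move in opposite directions
        # product == 0 means a tie in x or y; it counts toward neither total
    return (concordant - discordant) / len(pairs)


# Illustrative values: one swapped pair out of six.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
```

With four observations there are six pairs; five are concordant and one is discordant, so τ = (5 - 1) / 6 ≈ 0.667. The double loop is quadratic in the number of observations, so this form is only practical for small samples.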
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for it.
First rows
| | gameId | playId | nflId | frameId | time | jerseyNumber | team | playDirection | x | y | s | a | dis | o | dir | event |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2021093000 | 169 | 38696.0 | 1 | 2021-10-01T00:29:42.800 | 11.0 | JAX | right | 54.66 | 44.66 | 0.00 | 0.00 | 0.00 | 107.61 | 134.74 | None |
| 1 | 2021093000 | 169 | 38696.0 | 2 | 2021-10-01T00:29:42.900 | 11.0 | JAX | right | 54.66 | 44.66 | 0.00 | 0.00 | 0.00 | 107.61 | 130.79 | None |
| 2 | 2021093000 | 169 | 38696.0 | 3 | 2021-10-01T00:29:43.000 | 11.0 | JAX | right | 54.66 | 44.67 | 0.00 | 0.00 | 0.00 | 107.61 | 122.78 | None |
| 3 | 2021093000 | 169 | 38696.0 | 4 | 2021-10-01T00:29:43.100 | 11.0 | JAX | right | 54.66 | 44.66 | 0.00 | 0.00 | 0.00 | 107.61 | 134.78 | None |
| 4 | 2021093000 | 169 | 38696.0 | 5 | 2021-10-01T00:29:43.200 | 11.0 | JAX | right | 54.66 | 44.66 | 0.00 | 0.00 | 0.00 | 107.61 | 130.76 | None |
| 5 | 2021093000 | 169 | 38696.0 | 6 | 2021-10-01T00:29:43.300 | 11.0 | JAX | right | 54.66 | 44.66 | 0.00 | 0.00 | 0.00 | 107.61 | 141.35 | ball_snap |
| 6 | 2021093000 | 169 | 38696.0 | 7 | 2021-10-01T00:29:43.400 | 11.0 | JAX | right | 54.67 | 44.66 | 0.01 | 0.17 | 0.00 | 108.20 | 145.78 | None |
| 7 | 2021093000 | 169 | 38696.0 | 8 | 2021-10-01T00:29:43.500 | 11.0 | JAX | right | 54.67 | 44.66 | 0.08 | 0.72 | 0.01 | 108.20 | 82.47 | None |
| 8 | 2021093000 | 169 | 38696.0 | 9 | 2021-10-01T00:29:43.600 | 11.0 | JAX | right | 54.69 | 44.67 | 0.20 | 1.16 | 0.02 | 108.99 | 71.68 | None |
| 9 | 2021093000 | 169 | 38696.0 | 10 | 2021-10-01T00:29:43.700 | 11.0 | JAX | right | 54.73 | 44.69 | 0.44 | 1.65 | 0.04 | 108.99 | 64.96 | None |
Last rows
| | gameId | playId | nflId | frameId | time | jerseyNumber | team | playDirection | x | y | s | a | dis | o | dir | event |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1074596 | 2021100400 | 4161 | NaN | 41 | 2021-10-05T03:45:54.700 | NaN | football | right | 17.26 | 23.08 | 6.81 | 3.63 | 0.58 | NaN | NaN | None |
| 1074597 | 2021100400 | 4161 | NaN | 42 | 2021-10-05T03:45:54.800 | NaN | football | right | 17.77 | 22.62 | 7.18 | 3.93 | 0.69 | NaN | NaN | None |
| 1074598 | 2021100400 | 4161 | NaN | 43 | 2021-10-05T03:45:54.900 | NaN | football | right | 18.31 | 22.12 | 7.53 | 3.01 | 0.74 | NaN | NaN | None |
| 1074599 | 2021100400 | 4161 | NaN | 44 | 2021-10-05T03:45:55.000 | NaN | football | right | 18.87 | 21.59 | 7.81 | 2.65 | 0.77 | NaN | NaN | None |
| 1074600 | 2021100400 | 4161 | NaN | 45 | 2021-10-05T03:45:55.100 | NaN | football | right | 19.44 | 21.05 | 8.03 | 1.61 | 0.79 | NaN | NaN | pass_forward |
| 1074601 | 2021100400 | 4161 | NaN | 46 | 2021-10-05T03:45:55.200 | NaN | football | right | 20.03 | 20.49 | 8.14 | 0.63 | 0.81 | NaN | NaN | autoevent_passforward |
| 1074602 | 2021100400 | 4161 | NaN | 47 | 2021-10-05T03:45:55.300 | NaN | football | right | 20.54 | 20.01 | 8.73 | 0.49 | 0.70 | NaN | NaN | None |
| 1074603 | 2021100400 | 4161 | NaN | 48 | 2021-10-05T03:45:55.400 | NaN | football | right | 25.33 | 16.50 | 26.07 | 1.13 | 5.94 | NaN | NaN | None |
| 1074604 | 2021100400 | 4161 | NaN | 49 | 2021-10-05T03:45:55.500 | NaN | football | right | 27.40 | 14.92 | 25.89 | 2.45 | 2.60 | NaN | NaN | None |
| 1074605 | 2021100400 | 4161 | NaN | 50 | 2021-10-05T03:45:55.600 | NaN | football | right | 29.44 | 13.36 | 25.59 | 3.58 | 2.57 | NaN | NaN | None |